Predicate-Argument Structure-based Preordering for Japanese-English Statistical Machine Translation of Scientific Papers
Abstract
Translating Japanese to English is difficult because the two languages belong to different language families. Naïve phrase-based statistical machine translation (SMT) often fails to address the syntactic differences between Japanese and English. Preordering is a simple but effective approach that can model long-distance reordering, which is crucial when translating between Japanese and English. We therefore apply a predicate-argument structure-based preordering method to the Japanese-English statistical machine translation of scientific papers. Our method builds on the one described in (Hoshino et al., 2013) and extends its rules to handle the abbreviation and passivization frequently found in scientific papers. Experimental results show that the proposed method outperforms both the system of (Hoshino et al., 2013) and our phrase-based SMT baseline without preordering.
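To make the idea of preordering concrete, the following is a minimal toy sketch in Python. It is not the rule set of Hoshino et al. (2013) or of this paper. It assumes the sentence has already been analyzed into chunks carrying hypothetical role labels ("subj", "obj", "pred"), and it applies a single illustrative rule: move the predicate directly after the subject, so that the Japanese subject-object-verb order approximates the English subject-verb-object order before phrase-based translation.

```python
from typing import List, Tuple

# (surface text, role). The role labels are hypothetical and used only for
# illustration; a real system would obtain them from a predicate-argument
# structure analyzer.
Chunk = Tuple[str, str]


def preorder(chunks: List[Chunk]) -> List[Chunk]:
    """Toy SOV -> SVO rule: place predicate chunks right after the subject."""
    preds = [c for c in chunks if c[1] == "pred"]
    rest = [c for c in chunks if c[1] != "pred"]
    if not preds:
        # Nothing to move: return an unchanged copy.
        return list(chunks)
    # Insert after the last subject chunk, or at the front if there is none.
    subj_positions = [i for i, c in enumerate(rest) if c[1] == "subj"]
    pos = subj_positions[-1] + 1 if subj_positions else 0
    return rest[:pos] + preds + rest[pos:]


# "He read the paper": subject, object, predicate in Japanese order.
sentence = [("彼は", "subj"), ("論文を", "obj"), ("読んだ", "pred")]
reordered = preorder(sentence)
print(" ".join(surface for surface, _ in reordered))  # 彼は 読んだ 論文を
```

In a preordering pipeline, a transformation of this kind would be applied to the source side of both the training data and the test input, so that the translation model only has to learn mostly monotone alignments.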
Similar resources
Modeling the Translation of Predicate-Argument Structure for SMT
Predicate-argument structure contains rich semantic information of which statistical machine translation has not taken full advantage. In this paper, we propose two discriminative, feature-based models to exploit predicate-argument structures for statistical machine translation: 1) a predicate translation model and 2) an argument reordering model. The predicate translation model explores lexical ...
Japanese to English Machine Translation using Preordering and Compositional Distributed Semantics
The pipeline of modern statistical machine translation (SMT) systems consists of several stages, presenting interesting opportunities to tune it towards improved performance on distant language pairs like Japanese and English. We explore modifications to several parts of this pipeline. We include a preordering method in the preprocessing stage, a neural network based model in the tuning stage a...
Word-based Japanese typed dependency parsing with grammatical function analysis
We present a novel scheme for a word-based Japanese typed dependency parser which integrates syntactic structure analysis and grammatical function analysis such as predicate-argument structure analysis. Compared to bunsetsu-based dependency parsing, which is predominantly used in Japanese NLP, it provides a natural way of extracting syntactic constituents, which is useful for downstream applicatio...
Semantic Mapping Using Automatic Word Alignment and Semantic Role Labeling
To facilitate the application of semantics in statistical machine translation, we propose a broad-coverage predicate-argument structure mapping technique using automated resources. Our approach utilizes automatic syntactic and semantic parsers to generate Chinese-English predicate-argument structures. The system produced a many-to-many argument mapping for all PropBank argument types by computi...
Discriminative Preordering Meets Kendall's Tau Maximization
This paper explores a simple discriminative preordering model for statistical machine translation. Our model traverses binary constituent trees, and classifies whether children of each node should be reordered. The model itself is not extremely novel, but herein we introduce a new procedure to determine oracle labels so as to maximize Kendall’s τ. Experiments in Japanese-to-English translation...